Using Wide Range of Features for Author profiling
Authors
Abstract
Predicting an author’s age, gender, and personality traits by analyzing their documents is important in forensics, marketing, and resolving authorship disputes. Our system combines features drawn from different styles, lexicons, topics, familial tokens, and different categories of character n-grams to build a logistic regression model for four languages: English, Spanish, Italian, and Dutch. With this model, we obtained global ranking scores of 0.6623, 0.6547, 0.7411, and 0.7662 for English, Spanish, Italian, and Dutch, respectively.
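As a rough illustration of one feature family named in the abstract, the sketch below counts overlapping character n-grams and turns them into relative-frequency vectors that a classifier such as logistic regression could consume. The function names, the n-gram range, and the two sample sentences are hypothetical and chosen only for illustration; the abstract does not specify the system's actual n-gram categories or preprocessing.

```python
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Count overlapping character n-grams for every n in [n_min, n_max]."""
    text = text.lower()  # assumption: case-folding; the abstract does not say
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def to_vector(counts, vocabulary):
    """Project n-gram counts onto a fixed vocabulary as relative frequencies."""
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    return [counts[gram] / total for gram in vocabulary]


# Hypothetical toy documents, not data from the paper.
docs = ["I love my mom", "The match was great"]
profiles = [char_ngrams(d) for d in docs]

# Shared vocabulary: every n-gram seen in any document, in a stable order.
vocabulary = sorted(set().union(*profiles))
vectors = [to_vector(p, vocabulary) for p in profiles]
```

Each row of `vectors` has one entry per n-gram in `vocabulary`, so the rows can be stacked into a feature matrix and combined with other feature groups before model fitting.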
Similar resources
A Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure
Author profiling is a text classification technique used to predict the profiles of the authors of unknown texts by analyzing their writing styles. Author profiles are characteristics of the authors such as gender, age, native language, country, and educational background. The existing approaches for Author Profiling suffer from problems like high dimensionality of features and fail to capture th...
Author gender identification from text using Bayesian Random Forest
Nowadays, the heavy use of virtual environments and users' connections via social networks such as Facebook, Instagram, and Twitter make it more necessary than ever to find shared subjects in this environment. There are several applications that benefit from reliable methods for inferring the age and gender of users in social media. Such applications exist across a wide area of fields,...
Psychographic Profiling of Indian Young Adult Consumers of Smartphone - VALS Approach
The current youth market is characterized as tech-savvy variety seekers who have been active in using digital technology in unprecedented ways. The market segment defined here comprises young adults between the ages of 20 and 30, who are more comfortable with purchasing smartphones than the previous generation. There is fierce competition in the smartphone market, due to the large selection of devic...
Species Specific DNA Profiling Mycobacterial Genomes Using Polymerase Chain Reaction with Single Universal Primer (UP-PCR)
Three tuberculous and twenty-one non-tuberculous mycobacterial (NTM) reference strains and seventy-two isolates classified by biochemical tests were shown to produce specific sets of DNA fragments in a polymerase chain reaction with a single universal primer (UP-PCR). A rather wide limit of tolerance for variations in the procedure of PCR mixture preparation and thermocycling parameters was found. There...
A periodic folded piezoelectric beam for efficient vibration energy harvesting
Periodic piezoelectric beams have been used for broadband vibration energy harvesting in recent years. In this paper, a periodic folded piezoelectric beam (PFPB) is introduced. The PFPB has special features that distinguish it from other periodic piezoelectric beams. The Adomian decomposition method (ADM) is used to calculate the first two band gaps and twelve natural frequencies of the PF...